8.2-4

Let $A[1...n]$ be input integers in the range $0$ to $k$.

Design:
Preprocess($A$):
  Create array $R[-1...k]$ which all values are $0$
  for $i$ from $1$ to $n$:
    $R[ A[i] ] := R[ A[i] ] + 1$

  for $j$ from $0$ to $k$:
    $R[j] := R[j-1] + R[j]$

  Return $R$


$c$ $:=$ Preprocess($A$)
Query($a$, $b$):
  Return $c[b] - c[a-1]$

Correctness:
Firstly, we are gonna consider Preprocess. Observe that the first for-loop will never access $R$ out of range, since $0 \leq A[i] \leq k$ for all $1 \leq i \leq n$.
When the first for-loop terminates, it's easy to see that $R[j]$ records how many elements in $A$ equals to $j$.
To complete the proof, we have to prove $R[j] = \sharp \{ A[i] : A[i] \leq j \}$.   $(*)$
Because $(*)$ is not so obvious, we are gonna use loop invariant to prove it.
Loop Invariant: At the start of the $l$-th iteration, $(*)$ is true for all $-1 \leq j < l$.
Initialization: When the first iteration starts, $R[-1]$ is $0$, so loop invariant holds.
Maintenance:
  Suppose loop invariant holds when the $l$-th iteration starts. Observe that $R[j]$ is unmodified for all $l \leq j \leq k$ until the start of current iteration.
  It implies that $R[l]$ stores how many $l$'s in $A[1...n]$. Therefore, when the current iteration ends, $R[l]$ equals to $\sharp \{ A[i] : A[i] \leq l \}$, and $(*)$ is true for all $-1 \leq j \leq l$.
Termination: As our discussion in previous part, when the final iteration ends, $(*)$ is true for all $-1 \leq j \leq k$.

The correctness of Query follows immediately from $(*)$.

Analysis:

  1. Preprocess:
     (i) The construction and initialization for $R[-1...k]$ consumes $(k+2)t_1$ time, where $t_1$ is a constant.
     (ii) The first for-loop iterates $n$ times, it takes constant time operation in each iteration and totally consumes $n t_2$ time for some constant $t_2$.
     (iii) The second for-loop iterates $k+1$ times, it takes constant time operation in each iteration and totally consumes $(k+1) t_3$ time for some constant $t_3$.
     (iv) Return operation consumes constant time $t_4$.

 Hence, the time consumption of Preprocess is $T(n,k) = (k+2)t_1 + n t_2 + (k+1)t_3 + t_4$.
 And we have $\min\{ t_1, t_2, t_3\} \cdot (n + k) \leq T(n,k) \leq 2\max\{ t_1, t_2, t_3\} \cdot (n + k)$. Therefore, $T(n,k) = \Theta(n+k)$.

  1. Query:
     Clearly, its time complexity is $\Theta(1)$.


8-5

a.
If $k=1$, then $k$-sorted implies that $$A[i] = \sum_{j=i}^{i+1-1}{A[j]} \leq \sum_{j=i+1}^{i+1}{A[j]} = A[i+1]$$ for all $i = 1, 2, ..., n-1$. $\implies$ $A$ is sorted.

b.
$1$, $6$, $2$, $7$, $3$, $8$, $4$, $9$, $5$, $10$

c.
Suppose $A$ is $k$-sorted. Then $$\frac{\sum_{j=i}^{i+k-1}{A[j]}}{k} \leq \frac{\sum_{j=i+1}^{i+k}{A[j]}}{k} \iff \sum_{j=i}^{i+k-1}{A[j]} \leq \sum_{j=i+1}^{i+k}{A[j]} \iff 0 \leq A[i+k] - A[i] \iff A[i] \leq A[i+k]$$ for all $i = 1, 2, ..., n-1$.

d.
As our disscussion in c. An array $A$ is $k$-sorted if and only if $A[i] \leq A[i+k]$ for all $i = 1, 2, ..., n-1$.

Algorithm($A$, $n$, $k$):
  Partition the $n$ elements into $k$ groups of $\lceil \frac{n}{k} \rceil$ elements each($G_j[1 ... \lceil \frac{n}{k} \rceil]$, $j=1, ... , k-1$),
    and the last group made up of $n - (k-1)\lceil \frac{n}{k} \rceil$ elements.($G_j[1 ... (k-1)\lceil \frac{n}{k} \rceil]$, $j = k$)
  for $j$ from $1$ to $k$:
    Sort $G_j$
  for $j$ from $1$ to $k$:
    for $l$ from $1$ to $G_j.length$:
      $A[k(j-1)+l] := G_j[l]$

Evaluate the time complexity of our algorithm,
 (i) partition part takes $O(n)$ time
 (ii) the first for-loop takes $O(k \cdot \lceil \frac{n}{k} \rceil \lg {\lceil \frac{n}{k} \rceil}) = O(n \lg{\frac{n}{k}})$
 (iii) the second for-loop takes $O(n)$ since there are $n$ elements in total.

Hence, the time complexity is $O(n \lg{\frac{n}{k}})$.

f.
Let $k$ be constant. By our disscussion in c. , we have to sort a subsequence $A[1+0k]$, $A[1+1k]$, $...$ , $A[1+(\lfloor \frac{n}{k} \rfloor -1)k]$, and it takes $\Omega(\lfloor \frac{n}{k} \rfloor \lg{\lfloor \frac{n}{k} \rfloor}) = \Omega(n \lg n)$ since $k$ is constant. This completes the proof.


9.3-1

For groups of 7:
Consider the same algorithm by replacing with "groups of 7", the number of elements $\geq x$ is $4( \lceil \frac{1}{2} \lceil \frac{n}{7} \rceil \rceil -2 ) \geq \frac{2n}{7} -8$. Then we have the recurrence relation of time consumption: $$T(n) \leq T(\lceil \frac{n}{7} \rceil) + T(\frac{5n}{7} + 8) + O(n)$$ Prove $T(n) = O(n)$ by substitution method:
Guess $T(n) = O(n)$, then by induction hypothesis, we have
$T(n) \leq c \lceil \frac{n}{7} \rceil + c(\frac{5n}{7} + 8) + an$
$\leq c \frac{n}{7} + c + c(\frac{5n}{7} + 8) + an$
$\leq (\frac{6c}{7}+a)n + 9c$
$\implies$ By choosing $c > 14a$ and $n \geq n_0 \geq \frac{9c}{14}$, therefore we have $T(n) \leq cn$ and hence $T(n) = O(n)$ by induction.

For groups of 3:
Suppose $T(n) = O(n)$, then $T(n) \leq T(\lceil \frac{n}{3} \rceil) + T(\frac{2n}{3} + 4) + O(n) \leq c \frac{n}{3} + c + c(\frac{2n}{3} + 4) + an$ is impossible to be $\leq cn$.


9.3-8

Design:
Algorithm($X$, $Y$, $n$):
  $p_x := 1$, $p_y := 1$
  $q_x := n$, $q_y := n$

  while( True ):
    $m_x := \lfloor \frac{p_x+q_x}{2} \rfloor$, $m_y := \lfloor \frac{p_y+q_y}{2} \rfloor$
    if $X[m_x] > Y[m_y]$:
      if $p_x$ == $q_x$:
        Return $Y[m_y]$
      $q_x := m_x$
      $p_y := m_y$
    else:
      if $p_y$ == $q_y$:
        Return $X[m_x]$
      $p_x := m_x$
      $q_y := m_y$

Correctness:
We are gonna use loop invariant to prove correctness.
Loop Invariant:
  At the start of each iteration, the median(we will use "the Median" for latter discussion) of all $2n$ elements in arrays $X$ and $Y$ is contained in $X[p_x...q_x] \cup Y[p_y...q_y]$. $(*)$
Initialization:
  When the while-loop starts, $X[p_x...q_x] = X[1...n]$ and $Y[p_y...q_y] = Y[1...n]$. $\implies (*)$ holds.
Maintenance:
  Suppose $(*)$ holds for current iteration.
  $\because$ $X$ and $Y$ are sorted.
  $\therefore$ Any subarray of $X$, $Y$ are sorted also. Note that the Median must lie between the median of $X[p_x...q_x]$ and the median of $Y[p_y...q_y]$,
   ie. $\min\{ X[\lfloor \frac{p_x+q_x}{2} \rfloor], Y[\lfloor \frac{p_y+q_y}{2} \rfloor] \} \leq$ the Median $\leq \max\{ X[\lfloor \frac{p_x+q_x}{2} \rfloor], Y[\lfloor \frac{p_y+q_y}{2} \rfloor] \}$.
  Suppose $(*)$ holds for current iteration, that is, the Median is contained in $X[p_x...q_x] \cup Y[p_y...q_y]$.
   If $X[m_x] > Y[m_y]$, then the removal of $Y[p_y ... m_y-1] \cup X[m_x+1 ... q_x]$ keeps $(*)$ hold for next iteration.
   If $X[m_x] \leq Y[m_y]$, then the removal of $X[p_x ... m_x-1] \cup Y[m_y+1 ... q_y]$ keeps $(*)$ hold for next iteration.
Termination:
  Note that $p_x$ == $q_x \iff p_y$ == $q_y$ since $q_x-p_x$ always equals to $q_y-p_y$.
  Therefore, if $p_x$ == $q_x$ or $p_y$ == $q_y$, then it only remains $X[p_x]$ and $Y[p_y]$ two elements, then we return the smaller one.

Because it has $2n$ elements and we removes even number elements in each iteration, it must terminate and remain only two elements in the final iteration, the smaller one must be the Median.

Analysis:
In each iteration, $q_x-p_x$ == $q_y-p_y$, it removes $\lfloor\frac{q_x - p_x}{2}\rfloor + \lceil\frac{q_y - p_y}{2}\rceil = \lfloor\frac{q_y - p_y}{2}\rfloor + \lceil\frac{q_x - p_x}{2}\rceil = q_x - p_x = q_y - p_y$ elements of $X[p_x...q_x] \cup Y[p_y...q_y]$.
$\implies$ It remains half of remainder of previous iteration.
$\implies$ Since the time consumption of each iteration is constant, the algorithm's time complexity is $O(\lg n)$.